Simulations with Realistic Sensor-Fusion Detection Estimates of Objects

ABSTRACT

A method is implemented by a processing system with at least one computer processor. The method includes obtaining a visualization of a scene that includes a template of a simulation object within a region. The method includes generating a sensor-fusion representation of the template upon receiving the visualization as input. The method includes generating a simulation of the scene with a sensor-fusion detection estimate of the simulation object instead of the template within the region. The sensor-fusion detection estimate includes object contour data indicating bounds of the sensor-fusion representation. The sensor-fusion detection estimate represents the bounds or shape of an object as would be detected by a sensor-fusion system.

FIELD OF THE INVENTION

This disclosure relates generally to generating realistic sensor-fusion detection estimates of objects.

BACKGROUND

In general, there are many challenges to developing an autonomous or semi-autonomous vehicle. To assist with its development, the autonomous or semi-autonomous vehicle often undergoes numerous tests based on various scenarios. In this regard, simulations are often used in many instances since they are more cost effective to perform than actual driving tests. However, there are many instances in which simulations do not accurately represent real use-cases. For example, in some cases, some simulated camera images may look more like video game images than actual camera images. In addition, some types of sensors produce sensor data that is difficult and costly to simulate. For example, radar detections are known to be difficult to simulate with accuracy. As such, simulations with these types of inaccuracies may not provide the proper conditions for the development, testing, and evaluation of autonomous and semi-autonomous vehicles.

SUMMARY

The following is a summary of certain embodiments described in detail below. The described aspects are presented merely to provide the reader with a brief summary of these certain embodiments, and the description of these aspects is not intended to limit the scope of this disclosure. Indeed, this disclosure may encompass a variety of aspects that may not be explicitly set forth below.

In an example embodiment, a system for generating a realistic simulation includes at least a non-transitory computer readable medium and a processing system. The non-transitory computer readable medium includes a visualization of a scene that includes a template of a simulation object within a region. The processing system is communicatively connected to the non-transitory computer readable medium. The processing system includes at least one processing device, which is configured to execute computer-readable data to implement a method that includes generating a sensor-fusion representation of the template upon receiving the visualization as input. The method includes generating a simulation of the scene with a sensor-fusion detection estimate of the simulation object instead of the template within the region. The sensor-fusion detection estimate includes object contour data indicating bounds of the sensor-fusion representation. The sensor-fusion detection estimate represents the bounds or shape of an object as would be detected by a sensor-fusion system.

In an example embodiment, a computer-implemented method includes obtaining, via a processing system with at least one computer processor, a visualization of a scene that includes a template of a simulation object within a region. The method includes generating, via the processing system, a sensor-fusion representation of the template upon receiving the visualization as input. The method includes generating, via the processing system, a simulation of the scene with a sensor-fusion detection estimate of the simulation object instead of the template within the region. The sensor-fusion detection estimate includes object contour data indicating bounds of the sensor-fusion representation. The sensor-fusion detection estimate represents the bounds or shape of an object as would be detected by a sensor-fusion system.

In an example embodiment, a non-transitory computer readable medium includes computer-readable data that, when executed by a computer processor, is configured to implement a method. The method includes obtaining a visualization of a scene that includes a template of a simulation object within a region. The method includes generating a sensor-fusion representation of the template upon receiving the visualization as input. The method includes generating a simulation of the scene with a sensor-fusion detection estimate of the simulation object instead of the template within the region. The sensor-fusion detection estimate includes object contour data indicating bounds of the sensor-fusion representation. The sensor-fusion detection estimate represents the bounds or shape of an object as would be detected by a sensor-fusion system.

These and other features, aspects, and advantages of the present invention are discussed in the following detailed description in accordance with the accompanying drawings throughout which like characters represent similar or like parts.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a conceptual diagram of a non-limiting example of a simulation system according to an example embodiment of this disclosure.

FIG. 2 is a conceptual flowchart of a process for developing a machine-learning model for the simulation system of FIG. 1 according to an example embodiment of this disclosure.

FIG. 3 is an example of a method for training the machine-learning model of FIG. 2 according to an example embodiment of this disclosure.

FIG. 4 is an example of a method for generating simulations with realistic sensor-fusion detection estimates of objects according to an example embodiment of this disclosure.

FIG. 5A is a conceptual diagram of a single object in relation to sensors according to an example embodiment of this disclosure.

FIG. 5B is a diagram of a sensor-fusion detection of the object of FIG. 5A according to an example embodiment of this disclosure.

FIG. 6A is a conceptual diagram of multiple objects in relation to at least one sensor according to an example embodiment of this disclosure.

FIG. 6B is a diagram of a sensor-fusion detection based on the multiple objects of FIG. 6A according to an example embodiment of this disclosure.

FIG. 7 is a diagram that shows a superimposition of various data relating to objects of a geographic region according to an example embodiment of this disclosure.

FIG. 8A is a diagram of a non-limiting example of a scene with objects according to an example embodiment of this disclosure.

FIG. 8B is a diagram of a non-limiting example of the scene of FIG. 8A with sensor-based data in place of the objects according to an example embodiment of this disclosure.

DETAILED DESCRIPTION

The embodiments described herein have been shown and described by way of example, and many of their advantages will be understood from the foregoing description. It will be apparent that various changes can be made in the form, construction, and arrangement of the components without departing from the disclosed subject matter or sacrificing one or more of its advantages. Indeed, the described forms of these embodiments are merely explanatory. These embodiments are susceptible to various modifications and alternative forms, and the following claims are intended to encompass and include such changes and not be limited to the particular forms disclosed, but rather to cover all modifications, equivalents, and alternatives falling within the spirit and scope of this disclosure.

FIG. 1 is a conceptual diagram of an example of a simulation system 100, which is configured to generate simulations with realistic sensor-fusion detection estimates. In an example embodiment, the simulation system 100 has a processing system 110, which includes at least one processor. In this regard, for example, the processing system 110 includes at least a central processing unit (CPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), any suitable processing device, hardware technology, or any combination thereof. In an example embodiment, the processing system 110 is configured to perform a variety of functions, as described herein, such that simulations with realistic sensor-fusion detection estimates are generated and transmitted to any suitable application system 10.

In an example embodiment, the simulation system 100 includes a memory system 120, which comprises any suitable memory configuration that includes at least one non-transitory computer readable medium. For example, the memory system 120 includes semiconductor memory, random access memory (RAM), read only memory (ROM), virtual memory, electronic storage devices, optical storage devices, magnetic storage devices, memory circuits, any suitable memory technology, or any combination thereof. The memory system 120 is configured to include local, remote, or both local and remote components with respect to the simulation system 100. The memory system 120 stores various computer readable data. For example, in FIG. 1, the computer readable data includes at least program instructions, simulation data, machine-learning data (e.g., neural network data), sensor-fusion detection estimates, simulations, or any combination thereof. Also, in an example embodiment, the memory system 120 includes other relevant data, which relates to the functionalities described herein. In general, the memory system 120 is configured to provide the processing system 110 with access to various computer readable data such that the processing system 110 is enabled to at least generate various simulations of various scenarios in various environmental regions that include realistic sensor-fusion detection estimates of objects. These realistic simulations are then transmitted to and executed by one or more components of the application system 10.

In an example embodiment, the simulation system 100 also includes at least a communication network 130, an input/output interface 140, and other functional modules. The communication network 130 is configured to enable communications between and/or among one or more components of the simulation system 100. The communication network 130 includes wired technology, wireless technology, any suitable communication technology, or any combination thereof. For example, the communication network 130 enables the processing system 110 to communicate with the memory system 120 and the input/output interface 140. The input/output interface 140 is configured to enable communication between one or more components of the simulation system 100 and one or more components of the application system 10. For example, in FIG. 1, the input/output interface 140 is configured to provide an interface that enables simulations with realistic sensor-fusion detection estimates to be output to the vehicle processing system 30 via a communication link 150. In an example embodiment, the communication link 150 is any suitable communication technology that enables data communication between the simulation system 100 and the application system 10. Additionally, although not shown in FIG. 1, the simulation system 100 is configured to include other functional components (e.g., operating system, etc.), which include computer components that are known and not described herein.

In an example embodiment, the application system 10 is configured to receive realistic simulations from the simulation system 100. In an example embodiment, for instance, the application system 10 relates to a vehicle 20, which is autonomous, semi-autonomous, or highly-autonomous. Alternatively, the simulations can be applied to a non-autonomous vehicle. For example, in FIG. 1, the simulation system 100 provides simulations to one or more components of a vehicle processing system 30 of the vehicle 20. Non-limiting examples of one or more components of the vehicle processing system 30 include a trajectory system, a motion control system, a route-planning system, a prediction system, a navigation system, any suitable system, or any combination thereof. Advantageously, with these simulations, the vehicle 20 is provided with realistic input data without having to go on real-world drives, thereby leading to cost-effective development and evaluation of one or more components of the vehicle processing system 30.

FIG. 2 is a conceptual flowchart of a process 200 involved in developing machine-learning data (e.g., neural network data with at least one neural network model) such that the processing system 110 is configured to generate realistic sensor-fusion detection estimates of objects according to an example embodiment. The process 200 ensures that the machine-learning model is trained with a sufficient amount of proper training data. In this case, as shown in FIG. 2, the training data includes real-world sensor-fusion detections and their corresponding annotations. In an example embodiment, the training data is based on collected data, which is harvested via a data collection process 210 that includes a sufficiently large amount of data collections.

In an example embodiment, the data collection process 210 includes obtaining and storing a vast amount of collected data from the real world. More specifically, for instance, the data collection process 210 includes collecting sensor-based data (e.g., sensor data, sensor-fusion data, etc.) via various sensing devices that are provided on various mobile machines during various real-world drives. In this regard, for example, FIG. 2 illustrates a non-limiting example of a vehicle 220, which is configured to harvest sensor-based data from the real world and provide a version of this collected data to the memory system 230. In this example, the vehicle 220 includes at least one sensor system with various sensors 220A to detect an environment of the vehicle 220. In this case, the sensor system includes ‘n’ number of sensors 220A, where ‘n’ represents an integer number greater than 2. Non-limiting examples of the various sensors 220A include a light detection and ranging (LIDAR) sensor, a camera system, a radar system, an infrared system, a satellite-based sensor system (e.g., global navigation satellite system (GNSS), global positioning satellite (GPS), etc.), any suitable sensor, or any combination thereof.

In an example embodiment, the vehicle 220 includes a vehicle processing system 220B with non-transitory computer-readable memory. The computer-readable memory is configured to store various computer-readable data including program instructions, sensor-based data (e.g., raw sensor data, sensor-fusion data, etc.), and other related data (e.g., map data, localization data, etc.). The other related data provides relevant information (e.g., context) regarding the sensor-based data. In an example embodiment, the vehicle processing system 220B is configured to process the raw sensor data and the other related data. Additionally or alternatively, the processing system 220B is configured to generate sensor-fusion data based on the processing of the raw sensor data and the other related data. After obtaining this sensor-based data and other related data, the processing system 220B is configured to transmit or transfer a version of this collected data from the vehicle 220 to the memory system 230 via communication technology, which includes wired technology, wireless technology, or both wired and wireless technology.

In an example embodiment, the data collection process 210 is not limited to this data collection technique involving the vehicle 220, but can include other data gathering techniques that provide suitable real-world sensor-based data. In addition, the data collection process 210 includes collecting other related data (e.g., map data, localization data, etc.), which corresponds to the sensor-based data that is collected from the vehicle 220. In this regard, for example, the other related data is advantageous in providing context and/or further details regarding the sensor-based data.

In an example embodiment, the memory system 230 is configured to store the collected data in one or more non-transitory computer readable media, which includes any suitable memory technology in any suitable configuration. For example, the memory system 230 includes semiconductor memory, RAM, ROM, virtual memory, electronic storage devices, optical storage devices, magnetic storage devices, memory circuits, cloud storage systems, any suitable memory technology, or any combination thereof. For instance, in an example embodiment, the memory system 230 includes at least non-transitory computer readable media in at least a computer cluster configuration.

In an example embodiment, after this collected data has been stored in the memory system 230, the process 200 includes ensuring that a processing system 240 trains the machine-learning model with appropriate training data, which is based on this collected data. In an example embodiment, the processing system 240 includes at least one processor (e.g., CPU, GPU, processing circuits, etc.) with one or more modules, which include hardware, software, or a combination of hardware and software technology. For example, in FIG. 2, the processing system 240 contains one or more processors along with software, which includes at least a pre-processing module 240A and a processing module 240B. In this case, the processing system 240 executes program instructions, which are stored in the memory system 230, the processing system 240 itself (via local memory), or both the memory system 230 and the processing system 240.

In an example embodiment, upon obtaining the collected data, the pre-processing module 240A is configured to provide suitable training data for the machine-learning model. In FIG. 2, for instance, the pre-processing module 240A is configured to generate sensor-fusion detections upon obtaining the sensor-based data as input. More specifically, for example, upon receiving raw sensor data, the pre-processing module 240A is configured to generate sensor-fusion data based on this raw sensor data from the sensors of the vehicle 220. In this regard, for example, the sensor-fusion data refers to a fusion of sensor data from various sensors, which are sensing an environment at a given instance. In an example embodiment, the method is independent of the type of fusion approach and is implementable with early fusion and/or late fusion. The generation of sensor-fusion data is advantageous, as a view based on a combination of sensor data from various sensors is more complete and reliable than a view based on sensor data from an individual sensor. Upon generating or obtaining this sensor-fusion data, the pre-processing module 240A is configured to identify sensor-fusion data that corresponds to an object. In addition, the pre-processing module 240A is configured to generate a sensor-fusion detection, which includes a representation of the general bounds of the sensor-fusion data that relates to that identified object. With this pre-processing, the processing module 240B is enabled to handle these sensor-fusion detections, which identify objects, with greater ease and quickness compared to unbounded sensor-fusion data corresponding to those same objects.
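As a non-limiting illustration of this pre-processing step, the Python sketch below groups fused 2D points by object and bounds each group with a convex hull. The function and variable names (bound_detections, fused_points, object_ids) are hypothetical, and a convex hull is only one possible choice for representing the general bounds of a detection.

```python
# A minimal sketch, assuming fused sensor points already carry per-object
# labels: group points by object and bound each group with a convex hull.
import numpy as np
from scipy.spatial import ConvexHull

def bound_detections(fused_points: np.ndarray, object_ids: np.ndarray):
    """Return, per object id, the hull vertices that bound that object's points."""
    detections = {}
    for obj in np.unique(object_ids):
        pts = fused_points[object_ids == obj]   # (N, 2) fused points for one object
        if len(pts) < 3:                        # a 2D hull needs at least 3 points
            detections[obj] = pts
            continue
        hull = ConvexHull(pts)
        detections[obj] = pts[hull.vertices]    # polygon outline of the detection
    return detections
```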

In an example embodiment, the processing module 240B is configured to train at least one machine-learning model to generate sensor-fusion detection estimates for objects based on real-world training data according to real-use cases. In FIG. 2, for instance, the processing module 240B is configured to train the machine-learning model to generate sensor-fusion detection estimates for the objects based on training data, which includes real-world sensor-fusion detections together with corresponding annotations. More specifically, upon generating the real-world sensor-fusion detections, the process 200 includes an annotation process 250. The annotation process 250 includes obtaining annotations, which are objective and valid labels that identify these sensor-fusion detections in relation to the objects that they represent. In an example embodiment, for instance, the annotations are provided by annotators, such as skilled humans (or any reliable and verifiable technological means). More specifically, these annotators provide labels for identified sensor-fusion detections of objects (e.g., buildings, trees, pedestrians, signs, lane-markings) among the sensor-fusion data. In addition, the annotators are enabled to identify sensor-fusion data that corresponds to objects, generate sensor-fusion detections for these objects, and provide labels for these sensor-fusion detections. These annotations are stored with their corresponding sensor-fusion detections of objects as training data in the memory system 230. With this training data, the processing module 240B is configured to optimize a machine-learning architecture, its parameters, and its weights for a given task.

In an example embodiment, the processing module 240B is configured to train machine-learning technology (e.g., machine-learning algorithms) to generate sensor-fusion detection estimates for objects in response to receiving object data for these objects. In this regard, for example, the memory system 230 includes machine-learning data such as neural network data. More specifically, in an example embodiment, for instance, the machine-learning data includes a generative adversarial network (GAN). In an example embodiment, the processing module 240B is configured to train the GAN model to generate new objects based on different inputs. For example, the GAN is configured to transform one type of image (e.g., a visualization, a computer graphics-based image, etc.) into another type of image (e.g., a real-looking image such as a sensor-based image). The GAN is configured to modify at least parts of an image. As a non-limiting example, for instance, the GAN is configured to transform or replace one or more parts (e.g., extracted object data) of an image with one or more items (e.g., sensor-fusion detection estimates). In this regard, for example, with the appropriate training, the GAN is configured to change at least one general attribute of an image.
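Since the disclosure does not specify a network architecture, the following PyTorch sketch shows one plausible image-to-image GAN pairing under that assumption: a small encoder-decoder generator that maps a three-channel visualization to a one-channel occupancy-style image, and a patch-level discriminator that judges visualization/image pairs. All layer sizes and module names here are illustrative, not the patented design.

```python
# A minimal pix2pix-style GAN sketch; architecture details are assumptions.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, in_ch=3, out_ch=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, out_ch, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):            # x: (B, 3, H, W) scene visualization
        return self.net(x)           # (B, 1, H, W) occupancy-style estimate

class Discriminator(nn.Module):
    def __init__(self, in_ch=4):     # visualization channels + candidate map
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1),  # patch-level real/fake scores
        )

    def forward(self, vis, candidate):
        return self.net(torch.cat([vis, candidate], dim=1))
```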

In FIG. 2, for instance, the processing module 240B is configured to train the GAN model to transform extracted object data into sensor-fusion detection estimates. Moreover, the processing module 240B trains the GAN model to perform these transformations directly in response to object data without the direct assistance or execution of a sensor system, a perception system, or a sensor-fusion system. In this regard, the processing module 240B, via the GAN, generates realistic sensor-fusion detection estimates directly from object data without having to simulate sensor data (or generate sensor data estimates) for each sensor on an individual basis. This feature is advantageous as the processing module 240B circumvents the burdensome process of simulating image data from a camera system, LIDAR data from a LIDAR system, infrared data from an infrared sensor, radar data from a radar system, and/or other sensor data from other sensors on an individual basis in order to generate realistic input for an application system 10 (e.g., vehicle processing system 30). This feature also overcomes the difficulty in simulating radar data via a radar system, as this individual step is not performed by the processing module 240B. That is, the processing module 240B trains the GAN to generate realistic sensor-fusion detection estimates in direct response to receiving object data as input. Advantageously, this generation of sensor-fusion detection estimates improves the rate and costs associated with generating realistic sensor-based input for the development and evaluation of one or more components of the application system 10.

In an example embodiment, the generation of sensor-fusion detection estimates of objects includes the generation of sensor-fusion representations, which indicate bounds of detections corresponding to those objects. More specifically, in FIG. 2, the processing module 240B, via the GAN, is configured to generate sensor-fusion detection estimates of objects comprising representations of detections of those objects that include one or more data structures, graphical renderings, any suitable detection agents, or any combination thereof. For instance, the processing module 240B is configured to train the GAN to generate sensor-fusion detection estimates that include polygonal representations (e.g., box or box-like representations as shown in FIG. 7). Alternatively, the processing module 240B, via the GAN, is configured to generate sensor-fusion detection estimates that include complete contours (e.g., contours as shown in FIG. 8B).

In an example embodiment, the processing module 240B is configured to train the GAN to transform the extracted object data corresponding to the objects into sensor-fusion detection estimates, separately or collectively. For example, the processing module 240B is configured to train the GAN to transform object data of selected objects into sensor-fusion detection estimates on an individual basis (e.g., one at a time). Also, the processing module 240B is configured to train the GAN to transform one or more sets of object data of selected objects into sensor-fusion detection estimates, simultaneously. As another example, instead of performing transformations, the processing module 240B is configured to train the GAN to generate sensor-fusion detection estimates from object data of selected objects on an individual basis (e.g., one at a time). Also, the processing module 240B is configured to train the GAN to generate sensor-fusion detection estimates from one or more sets of object data of selected objects, simultaneously.

FIG. 3 is an example of a method 300 for training the machine-learning model to generate the sensor-fusion detection estimates based on real-world training data. In an example embodiment, the processing system 240 (e.g., the processing module 240B) is configured to perform the method shown in FIG. 3. In an example embodiment, the method 300 includes at least step 302, step 304, step 306, step 308, and step 310. In addition, the method can also include steps 312 and 314.

At step 302, in an example embodiment, the processing system 240 is configured to obtain training data. For instance, as shown in FIG. 2, the training data includes real-world sensor-fusion detections of objects and corresponding annotations. The annotations are valid labels that identify the real-world sensor-fusion detections in relation to the corresponding real-world objects that they represent. In this example, for instance, the annotations are input and verified by skilled humans. Upon obtaining this training data, the processing system 240 is configured to proceed to step 304.

At step 304, in an example embodiment, the processing system 240 is configured to train the neural network to generate realistic sensor-fusion detection estimates. The processing system 240 is configured to train the neural network (e.g., at least one GAN model) based on training data, which includes at least real-world sensor-fusion detections of objects and corresponding annotations. In an example embodiment, the training includes steps 306, 308, and 310. In addition, the training includes determining whether or not this training phase is complete, as shown at step 312. Also, the training can include other steps, which are not shown in FIG. 3, provided that the training results in a trained neural network model, which is configured to generate realistic sensor-fusion detection estimates as described herein.

At step 306, in an example embodiment, the processing system 240 is configured to generate sensor-fusion detection estimates via at least one machine-learning model. In an example embodiment, the machine-learning model includes a GAN model. In this regard, upon receiving the training data, the processing system 240 is configured to generate sensor-fusion detection estimates via the GAN model. In an example embodiment, a sensor-fusion detection estimate of an object provides a representation that indicates the general bounds of sensor-fusion data that is identified as that object. Non-limiting examples of these representations include data structures, graphical renderings, any suitable detection agents, or any combination thereof. For instance, the processing system 240 is configured to generate sensor-fusion detection estimates for objects that include polygonal representations, which comprise data structures with polygon data (e.g., coordinate values) and/or graphical renderings of the polygon data that indicate the polygonal bounds of detections amongst the sensor-fusion data for those objects. Upon generating sensor-fusion detection estimates for objects, the processing system 240 is configured to proceed to step 308.
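As a concrete, hedged illustration of such a polygon data structure, the sketch below stores an estimate as a list of coordinate values; the class and field names are assumptions for illustration only, not taken from the disclosure.

```python
# Illustrative data structure for a polygonal sensor-fusion detection estimate.
from dataclasses import dataclass

@dataclass
class PolygonEstimate:
    object_id: int                           # identifier of the detected object
    vertices: list[tuple[float, float]]      # polygon corners, e.g., in meters

# A box-like estimate such as those shown in FIG. 7 (values hypothetical):
estimate = PolygonEstimate(
    object_id=3,
    vertices=[(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)],
)
```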

At step 308, in an example embodiment, the processing system 240 is configured to compare the sensor-fusion detection estimates with the real-world sensor-fusion detections. In this regard, the processing system 240 is configured to determine discrepancies between the sensor-fusion detection estimates of objects and the real-world sensor-fusion detections of those same objects. For example, the processing system 240 is configured to perform at least one difference calculation or loss calculation based on a comparison between a sensor-fusion detection estimate and a real-world sensor-fusion detection. This feature is advantageous in enabling the processing system 240 to fine-tune the GAN model such that a subsequent iteration of sensor-fusion detection estimates is more realistic and more attuned to the real-world sensor-fusion detections than the current iteration of sensor-fusion detection estimates. Upon performing this comparison, the processing system 240 is configured to proceed to step 310.

At step 310, in an example embodiment, the processing system 240 is configured to update the neural network. More specifically, the processing system 240 is configured to update the model parameters based on comparison metrics obtained from the comparison, which is performed at step 308. For example, the processing system 240 is configured to improve the trained GAN model based on results of one or more difference calculations or loss calculations. Upon performing this update, the processing system 240 is configured to proceed to step 306 to further train the GAN model in accordance with the updated model parameters upon determining that the training phase is not complete at step 312. Alternatively, the processing system is configured to end this training phase at step 314 upon determining that this training phase is sufficient and/or complete at step 312.
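A minimal sketch of this generate/compare/update cycle (steps 306 through 312) is shown below, assuming a PyTorch generator and an L1 difference calculation; a full GAN training loop would also update a discriminator, and the loss choice, threshold, and names here are assumptions rather than the disclosed procedure.

```python
# Hedged training-loop sketch: generate estimates (step 306), compare against
# real-world detections (step 308), update (step 310), check completion (step 312).
import torch

def train(generator, optimizer, loader, loss_fn=torch.nn.L1Loss(),
          loss_threshold=0.05, max_epochs=100):
    for epoch in range(max_epochs):
        epoch_loss = 0.0
        for visualization, real_detection in loader:
            estimate = generator(visualization)           # step 306: estimate
            loss = loss_fn(estimate, real_detection)      # step 308: compare
            optimizer.zero_grad()
            loss.backward()                               # step 310: update
            optimizer.step()
            epoch_loss += loss.item()
        if epoch_loss / len(loader) < loss_threshold:     # step 312: complete?
            break                                         # step 314: end training
```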

At step 312, in an example embodiment, the processing system 240 is configured to determine whether or not this training phase is complete. In an example embodiment, for instance, the processing system 240 is configured to determine that the training phase is complete when the comparison metrics are within certain thresholds. In an example embodiment, the processing system 240 is configured to determine that the training phase is complete upon determining that the neural network (e.g., at least one GAN model) has been trained with a predetermined amount of training data (or a sufficient amount of training data). In an example embodiment, the training phase is determined to be sufficient and/or complete when accurate and reliable sensor-fusion detection estimates are generated by the processing system 240 via the GAN model. In an example embodiment, the processing system 240 is configured to determine that the training phase is complete upon receiving a notification that the training phase is complete.

At step 314, in an example embodiment, the processing system 240 is configured to end this training phase. In an example embodiment, upon completing this training phase, the neural network is deployable for use. For example, in FIG. 1, the simulation system 100 and/or processing system 110 is configured to obtain at least one trained neural network model (e.g., trained GAN model) from the memory system 230 of FIG. 2. Also, in an example embodiment, as shown in FIG. 1, the simulation system 100 is configured to employ the trained GAN model to generate or assist in the generation of realistic sensor-fusion detection estimates for simulations.

FIG. 4 is an example of a method 400 for generating simulations with realistic sensor-fusion detection estimates of objects according to an example embodiment. In an example embodiment, the simulation system 100, particularly the processing system 110, is configured to perform at least each of the steps shown in FIG. 4. As aforementioned, once the simulations are generated, the simulation system 100 is configured to provide these simulations to the application system 10, thereby enabling cost-effective development and evaluation of one or more components of the application system 10.

At step 402, in an example embodiment, the processing system 110 is configured to obtain simulation data, which includes a simulation program with at least one visualization of at least one simulated scene. In an example embodiment, for instance, the visualization of the scene includes at least a three-channel pixel image. More specifically, as a non-limiting example, a three-channel pixel image is configured to include, for example, in any order, a first channel with a location of the vehicle 20, a second channel with locations of simulation objects (e.g., dynamic simulation objects), and a third channel with map data. In this case, the map data includes information from a high-definition map. The use of a three-channel pixel image in which the simulation objects are provided in a distinct channel is advantageous in enabling efficient handling of the simulation objects. Also, in an example embodiment, each visualization includes a respective scene, scenario, and/or condition (e.g., snow, rain, etc.) from any suitable view (e.g., top view, side view, etc.). For example, a visualization of the scene with a two-dimensional (2D) top view of template versions of simulation objects within a region is relatively convenient and easy to generate compared to other views while also being relatively convenient and easy for the processing system 110 to handle.
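As a non-limiting sketch of such a three-channel pixel image, the NumPy snippet below encodes the ego-vehicle location, the simulation objects, and the map data in separate channels; the image size, channel order, and pixel encodings are illustrative assumptions.

```python
# A minimal sketch of a three-channel scene visualization (assumed 512x512).
import numpy as np

H = W = 512
image = np.zeros((H, W, 3), dtype=np.uint8)

image[250:262, 250:256, 0] = 255   # channel 0: location of the ego vehicle
image[100:120, 300:330, 1] = 255   # channel 1: a dynamic simulation object
image[:, 240:272, 2] = 128         # channel 2: map data, e.g., a road corridor
```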

In an example embodiment, the simulation objects are representations of real-world objects (e.g., pedestrians, buildings, animals, vehicles, etc.), which may be encountered in a region of that environment. In an example embodiment, these representations are model versions or template versions (e.g., non-sensor-based versions) of these real-world objects, thereby not being accurate or realistic input for the vehicle processing system 30 compared to real-world detections, which are captured by sensors 220A of the vehicle 220 during a real-world drive. In an example embodiment, the template version includes at least various attribute data of an object as defined within the simulation. For example, the attribute data can include size data, shape data, location data, other features of an object, any suitable data, or any combination thereof. In this regard, the generation of visualizations of scenes that include template versions of simulation objects is advantageous as this allows various scenarios and scenes to be generated at a fast and inexpensive rate since these visualizations can be developed without having to account for how various sensors would detect these simulation objects in the environment. As a non-limiting example, for instance, in FIG. 8A, the simulation data includes a visualization 800A, which is a 2D top view of a geographical region that includes roads near an intersection along with template versions of various objects, such as stationary objects (e.g., buildings, trees, fixed road features, lane-markings, etc.) and dynamic objects (e.g., other vehicles, pedestrians, etc.). Upon obtaining the simulation data, the processing system 110 performs step 404.

At step 404, in an example embodiment, the processing system 110 is configured to generate a sensor-fusion detection estimate for each simulation object. For example, in response to receiving the simulation data (e.g., a visualization of a scene) as input, the processing system 110 is configured to implement or employ at least one trained GAN model to generate sensor-fusion representations and/or sensor-fusion detection estimates in direct response to the input. More specifically, the processing system 110 is configured to implement a method to provide simulations with sensor-fusion detection estimates. In this regard, for instance, two different methods are discussed below, in which the first method involves image-to-image transformation and the second method involves image-to-contour transformation.

As a first method, in an example embodiment, the processing system 110 together with the trained GAN model is configured to perform image-to-image transformation such that a visualization of a scene with at least one simulation object is transformed into an estimate of a sensor-fusion occupancy map with sensor-fusion representations of the simulation object. In this case, the estimate of the sensor-fusion occupancy map is a machine-learning based representation of a real-world sensor-fusion occupancy map that a mobile machine (e.g., vehicle 20) would generate during a real-world drive. For example, the processing system 110 is configured to obtain simulation data with at least one visualization of at least one scene that includes a three-channel image or any suitable image. More specifically, in an example embodiment, the processing system 110, via the trained GAN model, is configured to transform the visualization of a scene with simulation objects into a sensor-fusion occupancy map (e.g., a 512×512 pixel image or any suitable image) with corresponding sensor-fusion representations of those simulation objects. As a non-limiting example, for instance, the sensor-fusion occupancy map includes sensor-fusion representations with one or more pixels having pixel data (e.g., pixel colors) that indicates object occupancy (and/or probability data relating to object occupancy for each pixel). In this regard, for example, upon obtaining a visualization of a scene (e.g., image 800A of FIG. 8A), the processing system 110 is configured to generate an estimate of a sensor-fusion occupancy map that is similar to image 800B of FIG. 8B in that sensor-fusion representations correspond to detections of simulation objects in a realistic manner based on the scenario, but different than the image 800B in that the sensor-fusion occupancy map does not yet include object contour data for the corresponding simulation objects as shown in FIG. 8B.

Also, for this first method, after generating the sensor-fusion occupancy map with sensor-fusion representations corresponding to simulation objects, the processing system 110 is configured to perform object contour extraction. More specifically, for example, the processing system 110 is configured to obtain object information (e.g., size and shape data) from the occupancy map. In addition, the processing system 110 is configured to identify pixels with an object indicator or an object marker as being sensor-fusion data that corresponds to a simulation object. For example, the processing system 110 is configured to identify one or more pixel colors (e.g., dark pixel colors) as having a relatively high probability of being sensor-fusion data that represents a corresponding simulation object and cluster those pixels together. Upon identifying pixels of a sensor-fusion representation that corresponds to a simulation object, the processing system 110 is then configured to obtain an outline of the clusters of pixels of sensor-fusion data that correspond to the simulation objects and present the outline as object contour data. In an example embodiment, the processing system 110 is configured to provide the object contour data as a sensor-fusion detection estimate for the corresponding simulation object.
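One conventional way to implement this threshold-cluster-outline step is sketched below with OpenCV; the disclosure does not name a library, so the threshold value and the use of cv2.findContours are assumptions rather than the disclosed mechanism.

```python
# A hedged sketch of contour extraction from an occupancy map: threshold
# likely-occupied pixels, then trace the outline of each pixel cluster.
import cv2
import numpy as np

def extract_contours(occupancy: np.ndarray, threshold: float = 0.5):
    """occupancy: (H, W) float map in [0, 1]; returns one outline per object."""
    mask = (occupancy > threshold).astype(np.uint8)          # likely-occupied pixels
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)  # outline each cluster
    return [c.reshape(-1, 2) for c in contours]              # (N, 2) point lists
```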

As a second method, in an example embodiment, the processing system 110 is configured to receive a visualization of a scene with at least one simulation object. For instance, as a non-limiting example of input, the processing system 110, via the at least one trained GAN model, is configured to receive a visualization of a scene that includes at least one simulation object in a center region with a sufficient amount of contextual information regarding the environment. As another example of input, the processing system 110, via the at least one trained GAN model, is configured to receive a visualization of a scene that includes at least one simulation object along with additional information provided in a data vector. For instance, in a non-limiting example, the data vector is configured to include additional information relating to the simulation object, such as a distance from that simulation object to the vehicle 20, information regarding other vehicles between the simulation object and the vehicle 20, environmental conditions (e.g., weather information), other relevant information, or any combination thereof.

Also, for this second method, upon receiving simulation data as input, the processing system 110, via the trained GAN model, is configured to transform each simulation object from the visualization directly into a corresponding sensor-fusion detection estimate, which includes object contour data. In this regard, for instance, the object contour data includes a suitable number of points that identify an estimate of an outline of bounds of the sensor-fusion data that represents that simulation object. For instance, as a non-limiting example, the processing system 110 is configured to generate object contour data, which is scaled in meters for 2D space and includes the following points: (1.2, 0.8), (1.22, 0.6), (2.11, 0.46), (2.22, 0.50), (2.41, 0.65), and (1.83, 0.70). In this regard, the object contour data advantageously provides an indication of estimates of bounds of sensor-fusion data that represent object detections as would be detected by a sensor-fusion system in an efficient manner with relatively low memory consumption.
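Stored directly, the example contour above is only six coordinate pairs, which illustrates the low memory consumption noted in the text; the dictionary layout and object_id field below are hypothetical, added for illustration only.

```python
# The example contour from the text, held as a lightweight estimate record.
detection_estimate = {
    "object_id": 7,                 # hypothetical identifier
    "contour_m": [(1.2, 0.8), (1.22, 0.6), (2.11, 0.46),
                  (2.22, 0.50), (2.41, 0.65), (1.83, 0.70)],  # meters, 2D space
}
```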

For the first method or the second method associated with step 404, the processing system 110 is configured to generate or provide an appropriate sensor-fusion detection estimate for each simulation object in accordance with how a real-world sensor-fusion system would detect such an object in that scene. In an example embodiment, the processing system 110 is configured to generate each sensor-fusion detection estimate for each simulation object on an individual basis. As another example, the processing system 110 is configured to generate or provide sensor-fusion detection estimates for one or more sets of simulation objects at the same time. As yet another example, the processing system 110 is configured to generate or provide sensor-fusion detection estimates for all of the simulation objects simultaneously. In an example embodiment, the processing system 110 is configured to provide object contour data as sensor-fusion detection estimates of simulation objects. After obtaining one or more sensor-fusion detection estimates, the processing system 110 proceeds to step 406.

At step 406, in an example embodiment, the processing system 110 is configured to apply the sensor-fusion detection estimates to at least one simulation. More specifically, for example, the processing system 110 is configured to generate a simulation scene, which includes at least one visualization of at least one scene with at least one sensor-fusion detection estimate in place of the template of the simulation object. In this regard, the simulation may include the visualization of the scene with a transformation of the extracted object data into sensor-fusion detection estimates or a newly generated visualization of the scene with sensor-fusion detection estimates in place of the extracted object data. Upon applying or including the sensor-fusion detection estimates as a part of the simulation, the processing system 110 is configured to proceed to step 408.

At step 408, in an example embodiment, the processing system 110 is configured to transmit the simulation to the application system 10 so that the simulation is executed on one or more components of the application system 10, such as the vehicle processing system 30. For example, the processing system 110 is configured to provide this simulation to a trajectory system, a planning system, a motion control system, a prediction system, a vehicle guidance system, any suitable system, or any combination thereof. More specifically, for instance, the processing system 110 is configured to provide the simulations with the sensor-fusion detection estimates to a planning system or convert the sensor-fusion detection estimates into a different data structure or a simplified representation for faster processing. With this realistic input, the application system 10 is provided with information, such as feedback data and/or performance data, which enables one or more components of the application system 10 to be evaluated and improved based on simulations involving various scenarios in a cost-effective manner.
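As a hedged sketch of the simplification mentioned above, one option is to collapse a contour into an axis-aligned bounding box before handing it to a planning system; the helper name below is hypothetical, and other simplified representations are equally possible.

```python
# Collapse a contour (here, the six example points from step 404) into an
# axis-aligned bounding box for faster downstream processing.
def contour_to_bbox(contour):
    xs = [p[0] for p in contour]
    ys = [p[1] for p in contour]
    return (min(xs), min(ys), max(xs), max(ys))   # (x_min, y_min, x_max, y_max)

contour = [(1.2, 0.8), (1.22, 0.6), (2.11, 0.46),
           (2.22, 0.50), (2.41, 0.65), (1.83, 0.70)]
bbox = contour_to_bbox(contour)                   # -> (1.2, 0.46, 2.41, 0.8)
```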

FIGS. 5A and 5B are conceptual diagrams relating to sensing an environment with respect to a sensor system according to an example embodiment. In this regard, FIG. 5A is a conceptual diagram of a real-world object 505 in relation to a sensor set associated with the vehicle 220 during the data collection process 210. More specifically, FIG. 5A shows an object 505, which is detectable by a sensor set that includes at least a first sensor 220A₁ (e.g., LIDAR sensor) with a first sensing view designated between lines 502 and a second sensor 220A₂ (e.g., camera sensor) with a second sensing view designated between lines 504. In this case, the first sensor 220A₁ and the second sensor 220A₂ have overlapping sensing ranges in which the object 505 is positioned. Meanwhile, FIG. 5B is a conceptual diagram of a sensor-fusion detection 508 of the object of FIG. 5A based on this sensor set. As shown in FIG. 5B, the sensor-fusion detection 508 includes an accurate representation of a first side 505A and a second side 505B of the object 505, but includes an inaccurate representation of a third side 505C and a fourth side 505D of the object 505. In this non-limiting scenario, the discrepancy between the actual object 505 and its sensor-fusion detection 508 may be due to the sensors, occlusion, positioning issues, any other issue, or any combination thereof. As demonstrated by FIGS. 5A and 5B, since the sensor-fusion detection 508 of the object 505 does not produce an exact match to the actual object 505 itself, the use of simulation data that includes sensor-based representations that match or more closely resemble an actual sensor-fusion detection 508 of the object 505 is advantageous in simulating realistic sensor-based input that the vehicle 220 would receive during a real-world drive.

FIGS. 6A and 6B are conceptual diagrams relating to sensing an environment that includes two objects in relation to a sensor system. In this example, as shown in FIG. 6A, both the first object 604 and the second object 605 are in a sensing range of at least one sensor 220A. Meanwhile, FIG. 6B is a conceptual diagram of a sensor-fusion detection 608 of the first object 604 and the second object 605 based at least on sensor data of the sensor 220A. As shown in FIG. 6B, the sensor-fusion detection 608 includes an accurate representation of a first side 604A and a second side 604B of the first object 604, but includes an inaccurate representation of the third side 604C and fourth side 604D of the first object 604. In addition, as shown in FIG. 6B, the sensor 220A does not detect the second object 605, at least since the first object 604 occludes the sensor 220A from detecting the second object 605. As demonstrated by FIGS. 6A and 6B, there are a number of discrepancies between the actual scene, which includes the first object 604 and the second object 605, and its sensor-based representation, which includes the sensor-fusion detection 608. These discrepancies highlight the advantage of using simulation data with sensor-based data that matches or more closely resembles an actual sensor-fusion detection 608 of both object 604 and object 605, which the vehicle 220 would receive from its sensor system during a real-world drive.

FIG. 7 is a conceptual diagram that shows a superimposition 700 of real-world objects 702 in relation to real-world sensor-fusion detections 704 of those same objects according to an example embodiment. In addition, the superimposition 700 also includes raw sensor data 706 (e.g., LIDAR data). Also, as a reference, the superimposition 700 includes a visualization of a vehicle 708, which includes a sensor system that is sensing an environment and generating this raw sensor data 706. More specifically, in FIG. 7, the real-world objects 702 are represented by polygons of a first color (e.g., blue) and the real-world sensor-fusion detections 704 are represented by polygons of a second color (e.g., red). In addition, FIG. 7 also includes some examples of sensor-fusion detection estimates 710 (or object contour data 710). As shown by this superimposition 700, there are differences between the general bounds of the real-world objects 702 and the general bounds of the real-world sensor-fusion detections 704. These differences show the advantage of using simulation data that more closely matches the real-world sensor-fusion detections 704 in the development of one or more components of an application system 10, as unrealistic representations and even minor differences may result in erroneous technological development.

FIGS. 8A and 8B illustrate non-limiting examples of images with different visualizations of top views of a geographic region according to an example embodiment. Also, for discussion purposes, the location 802 of a vehicle, which includes various sensors, is shown in FIGS. 8A and 8B. More specifically, FIG. 8A illustrates a first image 800A, which is a 2D top-view visualization of the geographic region. In this case, the first image 800A refers to an image with relatively well-defined objects, such as a visualization of a scene with simulated objects or a real-world image with annotated objects. The geographic region includes a number of real and detectable objects. For instance, in this non-limiting example, this geographic region includes a number of lanes, which are defined by lane markings (e.g., lane-markings 804A, 806A, 808A, 810A, 812A, 814A, 816A, and 818A) and other markings (e.g., stop marker 820A). In addition, this geographic region includes a number of buildings (e.g., a commercial building 822A, a first residential house 824A, a second residential house 826A, a third residential house 828A, and a fourth residential house 830A). This geographic region also includes at least one natural, detectable object (e.g., tree 832A). Also, this geographic region includes a number of mobile objects, e.g., five other vehicles (e.g., vehicles 834A, 836A, 838A, 840A, and 842A) traveling in a first direction, three other vehicles (e.g., vehicles 844A, 846A, and 848A) traveling in a second direction, and two other vehicles (e.g., vehicles 850A and 852A) traveling in a third direction.

FIG. 8B is a diagram of a non-limiting example of a second image 800B, which corresponds to the first image 800A of FIG. 8A according to an example embodiment. In this case, the second image 800B is a top-view visualization of the geographic region, which includes sensor-fusion based objects. In this regard, the second image 800B represents a display of the geographic region with sensor-based representations (e.g., real-world sensor-fusion detections or sensor-fusion detection estimates) of objects. As shown, based on its location 802, the vehicle is enabled, via its various sensors, to provide a sensor-fusion building detection 822B for most of the commercial building 822A. In addition, the vehicle is enabled, via its sensors, to provide sensor-fusion home detections 824B and 826B for some parts of two of the residential houses 824A and 826A, but is unable to detect the other two residential houses 828A and 830A. In addition, the vehicle is enabled, via its plurality of sensors and other related data (e.g., map data), to generate indications of lane-markings 804B, 806B, 808B, 810B, 812B, 814B, 816B, and 818B and an indication of stop marker 820B, except for some parts of the lanes within the intersection. Also, a sensor-fusion tree detection 832B is generated for some parts of the tree 832A. In addition, the sensor-fusion mobile object detections 836B and 846B indicate the obtainment of sensor-based data at varied levels for mobile objects, such as most parts of vehicle 836A, minor parts of vehicle 846A, and no parts of vehicle 834A.

As described herein, the simulation system 100 provides a number of advantageous features, as well as benefits. For example, when applied to the development of an autonomous or a semi-autonomous vehicle 20, the simulation system 100 is configured to provide simulations as realistic input to one or more components of the vehicle 20. For example, the simulation system 100 is configured to provide simulations to a trajectory system, a planning system, a motion control system, a prediction system, a vehicle guidance system, any suitable system, or any combination thereof. Also, by providing simulations with sensor-fusion detection estimates, which are the same as or remarkably similar to real-world sensor-fusion detections that are obtained during real-world drives, the simulation system 100 is configured to contribute to the development of an autonomous or a semi-autonomous vehicle 20 in a safe and cost-effective manner while also reducing safety-critical behavior.

In addition, the simulation system 100 employs a trained machine-learning model, which is advantageously configured for sensor-fusion detection estimation. More specifically, as discussed above, the simulation system 100 includes a trained machine-learning model (e.g., GAN, DNN, etc.), which is configured to generate sensor-fusion representations and/or sensor-fusion detection estimates in accordance with how a mobile machine, such as a vehicle 20, would provide such data via a sensor-fusion system during a real-world drive. Although the sensor-fusion detections of objects via a mobile machine vary in accordance with various factors (e.g., distance, sensor locations, occlusion, size, other parameters, or any combination thereof), the trained GAN model is nevertheless trained to generate or predominately contribute to the generation of realistic sensor-fusion detection estimates of these objects in accordance with real-use cases, thereby accounting for these various factors and providing realistic simulations to one or more components of the application system 10.

Furthermore, the simulation system 100 is configured to provide various representations and transformations via the same trained machine-learning model (e.g., trained GAN model), thereby improving the robustness of the simulation system 100 and its evaluation. Moreover, the simulation system 100 is configured to generate a large number of simulations by transforming or generating sensor-fusion representations and/or sensor-fusion detection estimates in place of object data in various scenarios in an efficient and effective manner, thereby leading to faster development of a safer system for an autonomous or semi-autonomous vehicle 20.

The above description is intended to be illustrative, and not restrictive, and is provided in the context of a particular application and its requirements. Those skilled in the art can appreciate from the foregoing description that the present invention may be implemented in a variety of forms, and that the various embodiments may be implemented alone or in combination. Therefore, while the embodiments of the present invention have been described in connection with particular examples thereof, the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the described embodiments. The true scope of the embodiments and/or methods of the present invention is not limited to the embodiments shown and described, since various modifications will become apparent to the skilled practitioner upon a study of the drawings, specification, and following claims. For example, components and functionality may be separated or combined differently than in the manner of the various described embodiments, and may be described using different terminology. These and other variations, modifications, additions, and improvements may fall within the scope of the disclosure as defined in the claims that follow.

What is claimed is:
1. A system for generating a realistic simulation, the system comprising: a non-transitory computer readable medium including a visualization of a scene that includes a template of a simulation object within a region; a processing system communicatively connected to the non-transitory computer readable medium, the processing system including at least one processing device and being configured to execute computer readable data that implements a method that includes: generating a sensor-fusion representation of the template upon receiving the visualization as input; and generating a simulation of the scene with a sensor-fusion detection estimate of the simulation object instead of the template within the region, the sensor-fusion detection estimate including object contour data indicating bounds of the sensor-fusion representation.
2. The system of claim 1, wherein: the processing system is configured to generate the sensor-fusion representation of the simulation object via a trained machine-learning model; and the trained machine-learning model is trained with (i) sensor-fusion data obtained from sensors during real-world drives of vehicles and (ii) annotations identifying object contour data of detections of objects from among the sensor-fusion data.
3. The system of claim 1, wherein: the processing system is configured to generate a sensor-fusion occupancy map directly from the visualization via a trained generative adversarial network (GAN) model in which the sensor-fusion representation is a part of the sensor-fusion occupancy map; and the processing system is configured to extract the object contour data based on occupancy criteria of the sensor-fusion occupancy map and provide the object contour data as the sensor-fusion detection estimate.
4. The system of claim 1, wherein the visualization includes a multi-channel pixel image in which the simulation object is in a channel for simulation objects that is distinct from the other channels.
5. The system of claim 1, wherein: the processing system is configured to receive location data of the simulation object as input along with the visualization to generate the sensor-fusion representation of the simulation object via a trained generative adversarial network (GAN) model; and the sensor-fusion representation includes object contour data that serves as the sensor-fusion detection estimate.
6. The system of claim 1, wherein the visualization includes a two-dimensional top view of the simulation object within the region.
7. The system of claim 1, wherein the sensor-fusion representation is based on a plurality of sensors including at least a camera, a satellite-based sensor, a light detection and ranging sensor, and a radar sensor.
8. A computer-implemented method comprising: obtaining, via a processing system with at least one computer processor, a visualization of a scene that includes a template of a simulation object within a region; generating, via the processing system, a sensor-fusion representation of the template upon receiving the visualization as input; and generating, via the processing system, a simulation of the scene with a sensor-fusion detection estimate of the simulation object instead of the template within the region, the sensor-fusion detection estimate including object contour data indicating bounds of the sensor-fusion representation.
9. The method of claim 8, wherein: the sensor-fusion representation of the simulation object is generated by employing a trained machine-learning model; and the trained machine-learning model is trained with at least (i) sensor-fusion data obtained from sensors during real-world drives of vehicles and (ii) annotations identifying object contour data of detections of objects from among the sensor-fusion data.
10. The method of claim 8, wherein: the step of generating the sensor-fusion representation of the template upon receiving the visualization as input includes generating a sensor-fusion occupancy map via a trained generative adversarial network (GAN) model in which the sensor-fusion representation is generated as a part of the sensor-fusion occupancy map; the object contour data is extracted based on occupancy criteria of the sensor-fusion occupancy map; and the object contour data is provided as the sensor-fusion detection estimate.
11. The method of claim 8, wherein the visualization includes a multi-channel pixel image in which the simulation object is in a channel for simulation objects that is distinct from the other channels.
12. The method of claim 8, further comprising: obtaining location data of the simulation object as input along with the visualization to generate the sensor-fusion representation of the simulation object via a trained generative adversarial network (GAN) model; wherein: the sensor-fusion representation includes object contour data that serves as the sensor-fusion detection estimate.
13. The method of claim 8, wherein the visualization includes a two-dimensional top view of the simulation object within the region.
14. The method of claim 8, wherein the sensor-fusion representation is based on a plurality of sensors including at least a camera, a satellite-based sensor, a light detection and ranging sensor, and a radar sensor.
15. A non-transitory computer readable medium with computer-readable data that, when executed by a computer processor, is configured to implement a method comprising: obtaining a visualization of a scene that includes a template of a simulation object within a region; generating a sensor-fusion representation of the template upon receiving the visualization as input; and generating a simulation of the scene with a sensor-fusion detection estimate of the simulation object instead of the template within the region, the sensor-fusion detection estimate including object contour data indicating bounds of the sensor-fusion representation.
16. The computer readable medium of claim 15, wherein: the sensor-fusion representation of the simulation object is generated via a trained machine-learning model; and the trained machine-learning model is trained with (i) sensor-fusion data obtained from sensors during real-world drives of vehicles and (ii) annotations identifying object contour data of detections of objects from among the sensor-fusion data.
17. The computer readable medium of claim 15, wherein the method includes: generating a sensor-fusion occupancy map via a trained generative adversarial network (GAN) model in which the sensor-fusion representation is a part of the sensor-fusion occupancy map; extracting object contour data based on occupancy criteria of the sensor-fusion occupancy map; and providing the object contour data as the sensor-fusion detection estimate.
18. The computer readable medium of claim 15, wherein the visualization includes a multi-channel pixel image in which the simulation object is in a channel for simulation objects that is distinct from the other channels.
19. The computer readable medium of claim 15, wherein the method includes: obtaining location data of the simulation object as input along with the visualization to generate the sensor-fusion representation of the simulation object via a trained generative adversarial network (GAN) model; and the sensor-fusion representation includes object contour data as the sensor-fusion detection estimate.
20. The computer readable medium of claim 15, wherein the visualization is a two-dimensional top view of the simulation object within the region.